 global search


Inference-time Scaling of Diffusion Models through Classical Search

Zhang, Xiangcheng, Lin, Haowei, Ye, Haotian, Zou, James, Ma, Jianzhu, Liang, Yitao, Du, Yilun

arXiv.org Machine Learning

Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle the challenge of inference-time control in diffusion models -- adapting generated outputs to meet diverse test-time objectives -- using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It employs a theoretically grounded local search via annealed Langevin MCMC and performs compute-efficient global exploration using breadth-first and depth-first tree search. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency. These results show that classical search provides a principled and practical foundation for inference-time scaling in diffusion models. Project page at diffusion-inference-scaling.github.io.
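To make the "local search via annealed Langevin MCMC" idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the 1-D Gaussian target, the analytic `smoothed_score`, the geometric noise schedule, and the step-size rule `alpha = c * sigma**2` are all assumptions chosen for illustration. In a diffusion model the score would come from a learned network, possibly combined with a test-time objective.

```python
import math
import random


def smoothed_score(x, sigma, mu=2.0, s=0.5):
    """Score (gradient of log-density) of the toy target N(mu, s^2)
    after convolution with Gaussian noise N(0, sigma^2).
    The target and its parameters are illustrative assumptions."""
    return -(x - mu) / (s * s + sigma * sigma)


def geometric_schedule(hi=2.0, lo=0.05, n=10):
    """Noise levels decreasing geometrically from hi to lo."""
    ratio = (lo / hi) ** (1.0 / (n - 1))
    return [hi * ratio ** i for i in range(n)]


def annealed_langevin(score, x, sigmas, steps_per_level=50, c=0.1, rng=random):
    """Run Langevin updates at each noise level, from coarse to fine."""
    for sigma in sigmas:
        alpha = c * sigma * sigma  # step size shrinks with the noise level
        for _ in range(steps_per_level):
            noise = rng.gauss(0.0, 1.0)
            x = x + 0.5 * alpha * score(x, sigma) + math.sqrt(alpha) * noise
    return x


def run_demo(n_samples=400, seed=0):
    """Draw several chains from a broad initial distribution and
    return the mean of their final states."""
    rng = random.Random(seed)
    sigmas = geometric_schedule()
    samples = [
        annealed_langevin(smoothed_score, rng.gauss(0.0, 2.0), sigmas, rng=rng)
        for _ in range(n_samples)
    ]
    return sum(samples) / len(samples)
```

Calling `run_demo()` should return a value close to the toy target mean of 2.0. The tree-search component described in the abstract (breadth-first and depth-first exploration over denoising trajectories) is not shown here.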


A Data Efficient Framework for Learning Local Heuristics

Veerapaneni, Rishi, Park, Jonathan, Saleem, Muhammad Suhail, Likhachev, Maxim

arXiv.org Artificial Intelligence

With the advent of machine learning, there have been several recent attempts to learn effective and generalizable heuristics. Local Heuristic A* (LoHA*) is one recent method that, instead of learning the entire heuristic estimate, learns a "local" residual heuristic that estimates the cost to escape a region (Veerapaneni et al., 2023). LoHA*, like other supervised learning methods, collects a dataset of target values by querying an oracle on many planning problems (in this case, local planning problems). This data collection process can become slow as the size of the local region increases or if the domain requires expensive collision checks. Our main insight is that when an A* search solves a start-goal planning problem, it inherently ends up solving multiple local planning problems. We exploit this observation to propose an efficient data collection framework that does <1/10th the amount of work (measured by expansions) to collect the same amount of data in comparison to baselines. This idea also enables us to run LoHA* in an online manner, where we can iteratively collect data and improve our model while solving relevant start-goal tasks. We demonstrate the performance of our data collection and online framework on a 4D $(x, y, \theta, v)$ navigation domain.


OptBA: Optimizing Hyperparameters with the Bees Algorithm for Improved Medical Text Classification

Shaaban, Mai A., Kashkash, Mariam, Alghfeli, Maryam, Ibrahim, Adham

arXiv.org Artificial Intelligence

One of the challenges that artificial intelligence engineers face, specifically in the field of deep learning, is obtaining the optimal model hyperparameters. The search for optimal hyperparameters usually hinders the progress of solutions to real-world problems in domains such as healthcare. To overcome this hurdle, the proposed work introduces a novel mechanism called "OptBA" to automatically fine-tune the hyperparameters of deep learning models by leveraging the Bees Algorithm, which is a recent promising swarm intelligence algorithm. In this paper, the optimization problem of OptBA is to maximize the accuracy in classifying ailments using medical text, where initial hyperparameters are iteratively adjusted by specific criteria. Experimental results demonstrate a noteworthy enhancement in accuracy of approximately 1.4%. This outcome highlights the effectiveness of the proposed mechanism in addressing the critical issue of hyperparameter optimization and its potential impact on advancing solutions for healthcare and other societal challenges.


Amortized Global Search for Efficient Preliminary Trajectory Design with Deep Generative Models

Li, Anjian, Sinha, Amlan, Beeson, Ryne

arXiv.org Artificial Intelligence

For example, a grid-based search is a classical approach for spacecraft preliminary trajectory design. However, this technique is more suitable for impulsive trajectories, since the search space is much smaller. Due to the curse of dimensionality, low-thrust trajectory design often needs a more intelligent global search algorithm. Evolutionary algorithms, including Differential Evolution (DE) [4], Genetic Algorithm (GA) [5], Particle Swarm Optimization (PSO) [6], etc., have been widely used in global optimization problems in spacecraft trajectory design [7, 8, 9, 10]. These algorithms iteratively generate new solutions by introducing randomness to previously obtained solutions and downselecting the solutions based on specific quality metrics. In addition, researchers also combine stochastic search algorithms with local gradient-based optimizers to attempt to find the globally optimal solution. The multistart method samples the search space with a fixed distribution and feeds the samples into a local optimizer as starting points for local search [10]. Inspired by energy minimization principles in computational chemistry, Monotonic Basin Hopping (MBH) [11, 12] adds random perturbations during the local search to uncover multiple local optima that are close to each other. MBH rapidly became popular in the sphere of spacecraft trajectory design [1, 13, 14] and has been established as the state-of-the-art algorithm in terms of efficiency and solution quality through various benchmarks [15, 9, 10].
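The difference between multistart and Monotonic Basin Hopping can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the 1-D Rastrigin-style test function, the simple pattern search standing in for a gradient-based local optimizer, and the Gaussian perturbation with scale `sigma`. A real trajectory design problem would use a high-dimensional decision vector and a constrained nonlinear programming solver.

```python
import math
import random


def rastrigin_1d(x):
    """Toy multimodal cost with many nearby local minima (assumption)."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


def local_minimize(f, x, step=0.1, tol=1e-6):
    """Hypothetical stand-in for a local optimizer: 1-D pattern search
    that moves downhill and halves the step when no neighbor improves."""
    fx = f(x)
    while step > tol:
        improved = False
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                improved = True
                break
        if not improved:
            step *= 0.5
    return x, fx


def multistart(f, lo, hi, n_starts=20, seed=0):
    """Sample starting points from a fixed (uniform) distribution,
    run the local optimizer from each, and keep the best result."""
    rng = random.Random(seed)
    results = [local_minimize(f, rng.uniform(lo, hi)) for _ in range(n_starts)]
    return min(results, key=lambda r: r[1])


def monotonic_basin_hopping(f, x0, n_hops=200, sigma=1.0, seed=0):
    """Perturb the current best local optimum, re-optimize locally,
    and accept the new point only if the cost strictly improves."""
    rng = random.Random(seed)
    best_x, best_f = local_minimize(f, x0)
    for _ in range(n_hops):
        trial = best_x + rng.gauss(0.0, sigma)
        x, fx = local_minimize(f, trial)
        if fx < best_f:  # monotonic acceptance rule
            best_x, best_f = x, fx
    return best_x, best_f
```

For example, `monotonic_basin_hopping(rastrigin_1d, 3.3)` starts in a poor basin and steps through neighboring basins toward lower cost, while `multistart(rastrigin_1d, -5.0, 5.0)` relies on one of its independent samples landing in a good basin.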


Quadratic speedup of global search using a biased crossover of two good solutions

Isomura, Takuya

arXiv.org Machine Learning

The minimisation of cost functions is crucial in various optimisation fields. However, identifying their global minimum remains challenging owing to the huge computational cost incurred. This work analytically expresses the computational cost to identify an approximate global minimum for a class of cost functions defined under a high-dimensional discrete state space. Then, we derive an optimal global search scheme that minimises the computational cost. Mathematical analyses demonstrate that a combination of the gradient descent algorithm and the selection and crossover algorithm--with a biased crossover weight--maximises the search efficiency. Remarkably, its computational cost is of the square root order in contrast to that of the conventional gradient descent algorithms, indicating a quadratic speedup of global search. We corroborate this proposition using numerical analyses of the travelling salesman problem. The simple computational architecture and minimal computational cost of the proposed scheme are highly desirable for biological organisms and neuromorphic hardware.


MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking

Wang, Xiao, Shu, Xiujun, Zhang, Shiliang, Jiang, Bo, Wang, Yaowei, Tian, Yonghong, Wu, Feng

arXiv.org Artificial Intelligence

Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive weighting scheme (or attention mechanism). Different from these works, we propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data by adaptively adjusting the convolutional kernels for various input images in practical tracking. Given the image pairs as input, we first encode their features with the backbone network. Then, we concatenate these feature maps and generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps, respectively. Inspired by residual connections, both the generated visible and thermal feature maps will be summed with the input feature maps. The augmented feature maps will be fed into the RoI align module to generate instance-level features for subsequent classification. To address issues caused by heavy occlusion, fast motion, and out-of-view targets, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism. The spatial and temporal recurrent neural network is used to capture the direction-aware context for accurate global attention prediction. Extensive experiments on three large-scale RGB-T tracking benchmark datasets validated the effectiveness of our proposed algorithm. The project page of this paper is available at https://sites.google.com/view/mfgrgbttrack/.


Dynamic Attention guided Multi-Trajectory Analysis for Single Object Tracking

Wang, Xiao, Chen, Zhe, Tang, Jin, Luo, Bin, Wang, Yaowei, Tian, Yonghong, Wu, Feng

arXiv.org Artificial Intelligence

Most existing single object trackers track the target in a unitary local search window, making them particularly vulnerable to challenging factors such as heavy occlusions and out-of-view movements. Despite attempts to further incorporate global search, prevailing mechanisms that combine local and global search are relatively static, and thus are still sub-optimal for improving tracking performance. By further studying the local and global search results, we raise a question: can we allow more dynamics when combining both results? In this paper, we propose to introduce more dynamics by devising a dynamic attention-guided multi-trajectory tracking strategy. In particular, we construct a dynamic appearance model that contains multiple target templates, each of which provides its own attention for locating the target in the new frame. Guided by different attention, we maintain diversified tracking results for the target to build a multi-trajectory tracking history, allowing more candidates to represent the true target trajectory. After spanning the whole sequence, we introduce a multi-trajectory selection network to find the best trajectory that delivers improved tracking performance. Extensive experimental results show that our proposed tracking strategy achieves compelling performance on various large-scale tracking benchmarks. The project page of this paper can be found at https://sites.google.com/view/mt-track/.


Learning DAGs with continuous optimization

AIHub

As datasets continually increase in size and complexity, our ability to uncover meaningful insights from unstructured and unlabeled data is crucial. At the same time, a premium has been placed on delivering simple, human-interpretable, and trustworthy inferential models of data. One promising class of such models is graphical models, which have been used to extract relational information from massive datasets arising from a wide variety of domains including biology, medicine, business, and finance, just to name a few. Graphical models are families of multivariate distributions with compact representations expressed as graphs. In both undirected (Markov networks) and directed (Bayesian networks) graphical models, the graph structure guides the factorization of the joint distribution into smaller local specifications such as clique potentials or local conditionals of a variable given its "parent" variables.
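For concreteness, the two factorizations described above are usually written as follows. The notation is an assumption and is not taken from the post: $\mathrm{pa}(i)$ denotes the parent set of node $i$ in the directed acyclic graph, $\mathcal{C}$ the set of cliques of the undirected graph, $\psi_C$ nonnegative clique potentials, and $Z$ the normalizing constant (a sum for discrete variables, an integral for continuous ones).

```latex
% Directed model (Bayesian network): product of local conditionals
p(x_1, \dots, x_d) = \prod_{i=1}^{d} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr)

% Undirected model (Markov network): normalized product of clique potentials
p(x_1, \dots, x_d) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(x_C),
\qquad
Z = \sum_{x} \prod_{C \in \mathcal{C}} \psi_C(x_C)
```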


The Global Search for Education: How Building Robots Builds Confidence in Girls

#artificialintelligence

Posted by C. M. Rubin on Oct 9, 2019. "We added 'Artificial Intelligence' to 'Robotics & STEM' this year because it is an important and timely topic for young people to learn about." Prior to joining the Girls of Steel Robotics Program at Carnegie Mellon University's (CMU) Field Robotics Center, Theresa Richards was a science teacher in Pittsburgh, where she created an award-winning lesson integrating robotics into a Human Anatomy and Physiology course. The problem her organization is trying to solve is the demand for more people in STEM, and in particular, women. A December 2018 report in Pittsburgh shows there are 80,000 STEM jobs currently available. "We believe that building robots builds confidence in STEM," says Richards.


The Global Search for Education: Who's Working on Keeping Our Data Safe?

#artificialintelligence

Posted by C. M. Rubin on Jul 18, 2018. "We're working on two areas to improve confidentiality of cloud computing." Data and the intelligence that can be gained from it are seen as a solution to many of the world's largest challenges, but despite the great opportunities, there are also significant risks. Data-based companies use data to make money. As computer systems become increasingly centralized and ubiquitous, the possibility of widespread security breaches becomes a risk, and we've already seen the devastating consequences, e.g. the Cambridge Analytica scandal. Jon Crowcroft, the Marconi Professor of Communications Systems at the Alan Turing Institute – University of Cambridge, has researched these issues throughout his career in computer science.